AITopics | variational information bottleneck

variational information bottleneck

e96ed478dab8595a7dbda4cbcbee168f-Supplemental.pdf

Neural Information Processing Systems, Feb-11-2026, 17:08:02 GMT

iclr, learning, representation, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

WebRouter: Query-specific Router via Variational Information Bottleneck for Cost-sensitive Web Agent

Li, Tao, Hu, Jinlong, Wang, Yang, Liu, Junfeng, Liu, Xuejun

arXiv.org Artificial Intelligence, Oct-14-2025

LLM-brained web agents offer powerful capabilities for web automation but face a critical cost-performance trade-off. The challenge is amplified by web agents' inherently complex prompts that include goals, action histories, and environmental states, leading to degraded LLM ensemble performance. To address this, we introduce WebRouter, a novel query-specific router trained from an information-theoretic perspective. Our core contribution is a cost-aware Variational Information Bottleneck (ca-VIB) objective, which learns a compressed representation of the input prompt while explicitly penalizing the expected operational cost. Experiments on five real-world websites from the WebVoyager benchmark show that WebRouter reduces operational costs by a striking 87.8% compared to a GPT-4o baseline, while incurring only a 3.8% accuracy drop.
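
As a rough illustration of what a cost-penalized VIB loss could look like, the sketch below adds an expected-cost term to the standard VIB loss (prediction term plus a KL compression term). It is not the paper's ca-VIB formulation, which the abstract does not spell out. The Gaussian encoder, the standard-normal prior, the function names, and the weights `beta` and `lam` are all assumptions made for this example.

```python
import numpy as np

def gaussian_kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), one value per row."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def cost_aware_vib_loss(mu, log_var, logits, target, model_costs,
                        beta=1e-3, lam=0.1):
    """Hypothetical cost-aware VIB loss (illustrative, not from the paper).

    mu, log_var : (batch, d) parameters of the Gaussian encoder p(z|x)
    logits      : (batch, k) router scores over k candidate LLMs
    target      : (batch,)   index of the LLM the router should pick
    model_costs : (k,)       assumed per-call cost of each LLM
    """
    probs = softmax(logits)
    rows = np.arange(logits.shape[0])
    nll = -np.log(probs[rows, target] + 1e-12)        # prediction term
    kl = gaussian_kl_to_standard_normal(mu, log_var)  # compression term
    expected_cost = probs @ model_costs               # cost under the routing distribution
    return float(np.mean(nll + beta * kl + lam * expected_cost))
```

With `mu` and `log_var` at zero the KL term vanishes, so the loss reduces to the routing cross-entropy plus `lam` times the expected cost. Raising `lam` pushes the router toward cheaper models.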

large language model, machine learning, natural language, (17 more...)

arXiv:2510.11221

Country: Asia > China (0.69)

Genre: Research Report (0.40)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

e96ed478dab8595a7dbda4cbcbee168f-Supplemental.pdf

Neural Information Processing Systems, Oct-9-2025, 16:31:47 GMT

artificial intelligence, machine learning, representation, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Reviews: Relevant sparse codes with variational information bottleneck

Neural Information Processing Systems, Feb-11-2025, 20:02:59 GMT

I find the paper novel and interesting. To my knowledge the algorithm is original and it adds to the existing toolbox of IB-based approaches. The proposed method seems to outperform Gaussian IB on denoising and occlusion/inpainting tasks on simulated and real data. It also provides new analysis tools for sparse representations in the form of IB information curves. Overall I think this work has many promising applications in machine learning and neuroscience and would be of interest to the NIPS audience.

brief justification, relevant sparse code, variational information bottleneck, (3 more...)

Technology: Information Technology > Artificial Intelligence (0.58)

A Distance Metric Learning Model Based On Variational Information Bottleneck

Zhang, YaoDan, Wang, Zidong, Jia, Ru, Li, Ru

arXiv.org Artificial Intelligence, Mar-5-2024

In recent years, personalized recommendation technology has flourished and become a prominent research direction. The matrix factorization model and the metric learning model, proposed in succession, have been widely studied and applied. The latter uses the Euclidean distance instead of the dot product used by the former to measure the latent space vector. While this avoids the shortcomings of the dot product, the assumption underlying the Euclidean distance is neglected, which limits the recommendation quality of the model. To solve this problem, this paper combines the Variational Information Bottleneck with a metric learning model for the first time and proposes a new metric learning model, VIB-DML (Variational Information Bottleneck Distance Metric Learning), for rating prediction. VIB-DML limits the mutual information of the latent space feature vector to improve the robustness of the model, and satisfies the assumption of Euclidean distance by decoupling the latent space feature vector. Experimental results are compared using root mean square error (RMSE) on three public datasets. The results show that the generalization ability of VIB-DML is excellent: compared with the general metric learning model MetricF, the prediction error is reduced by 7.29%. Finally, the paper demonstrates the strong robustness of VIB-DML through experiments.

euclidean distance, vector, vib-dml, (15 more...)

arXiv:2403.02794

Country:

North America > United States > New York (0.05)
Asia > Mongolia (0.05)
Asia > China > Inner Mongolia > Hohhot (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Flexible Variational Information Bottleneck: Achieving Diverse Compression with a Single Training

Kudo, Sota, Ono, Naoaki, Kanaya, Shigehiko, Huang, Ming

arXiv.org Artificial Intelligence, Feb-2-2024

Information Bottleneck (IB) is a widely used framework that enables the extraction of information related to a target random variable from a source random variable. In the objective function, IB controls the trade-off between data compression and predictiveness through the Lagrange multiplier $\beta$. Traditionally, to find the trade-off to be learned, IB requires a search for $\beta$ through multiple training cycles, which is computationally expensive. In this study, we introduce Flexible Variational Information Bottleneck (FVIB), an innovative framework for classification tasks that can obtain optimal models for all values of $\beta$ with a single, computationally efficient training run. We theoretically demonstrate that across all reasonable values of $\beta$, FVIB can simultaneously maximize an approximation of the objective function for Variational Information Bottleneck (VIB), the conventional IB method. Then we empirically show that FVIB can learn the VIB objective as effectively as VIB. Furthermore, in terms of calibration performance, FVIB outperforms other IB and calibration methods by enabling continuous optimization of $\beta$. Our code is available at https://github.com/sotakudo/fvib.
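
For orientation, the trade-off that $\beta$ controls can be written out explicitly. The block below follows the convention of Alemi et al. (2017), in which $\beta$ weights the compression term. The abstract does not say which convention FVIB adopts, so the placement of $\beta$ and the symbols $Z$ (the representation), $q(y \mid z)$ (a variational decoder), and $r(z)$ (a variational prior) are assumptions for this sketch.

```latex
% IB objective: keep information about the target Y, discard information about the source X
\max_{p(z \mid x)} \; I(Z;Y) \;-\; \beta\, I(Z;X)

% VIB: a tractable loss whose negation lower-bounds the IB objective above
\mathcal{L}_{\mathrm{VIB}}
  = \mathbb{E}_{p(x,y)}\,\mathbb{E}_{p(z \mid x)}\!\left[-\log q(y \mid z)\right]
  + \beta\, \mathbb{E}_{p(x)}\!\left[\mathrm{KL}\!\left(p(z \mid x)\,\|\,r(z)\right)\right]
```

Under this convention each value of $\beta$ defines a different loss, which is why a conventional search over $\beta$ needs one training run per value.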

approximation, equation, vib objective, (12 more...)

arXiv:2402.01238

Country:

North America > United States > District of Columbia > Washington (0.04)
Asia > Vietnam (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

Contrastive variational information bottleneck for aspect-based sentiment analysis

Chang, Mingshan, Yang, Min, Jiang, Qingshan, Xu, Ruifeng

arXiv.org Artificial Intelligence, Dec-21-2023

Deep learning techniques have dominated the literature on aspect-based sentiment analysis (ABSA), achieving state-of-the-art performance. However, deep models generally suffer from spurious correlations between input features and output labels, which hurt robustness and generalization capability by a large margin. In this paper, we propose to reduce spurious correlations for ABSA via a novel Contrastive Variational Information Bottleneck framework (called CVIB). The proposed CVIB framework is composed of an original network and a self-pruned network, and these two networks are optimized simultaneously via contrastive learning. Concretely, we employ the Variational Information Bottleneck (VIB) principle to learn an informative and compressed network (the self-pruned network) from the original network, which discards the superfluous patterns or spurious correlations between input features and prediction labels. Then, self-pruning contrastive learning is devised to pull together semantically similar positive pairs and push away dissimilar pairs: the representations of the anchor learned by the original and self-pruned networks are regarded as a positive pair, while the representations of two different sentences within a mini-batch are treated as a negative pair. To verify the effectiveness of our CVIB method, we conduct extensive experiments on five benchmark ABSA datasets, and the experimental results show that our approach achieves better performance than strong competitors in terms of overall prediction performance, robustness, and generalization. Code and data to reproduce the results in this paper are available at: https://github.com/shesshan/CVIB.
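
The pairing scheme described above can be sketched with a generic InfoNCE-style loss. This is an illustration, not the CVIB implementation (see the authors' repository for that). The function name, the temperature value, the cosine similarity, and the simplification that negatives are drawn only across the two views are all assumptions made here.

```python
import numpy as np

def info_nce(original_repr, pruned_repr, temperature=0.1):
    """Generic InfoNCE-style contrastive loss (illustrative sketch).

    original_repr[i] and pruned_repr[i] are treated as a positive pair
    (the same sentence seen by two networks); every other row in the
    mini-batch serves as a negative for row i.
    """
    # L2-normalise so the dot product is a cosine similarity
    a = original_repr / np.linalg.norm(original_repr, axis=1, keepdims=True)
    p = pruned_repr / np.linalg.norm(pruned_repr, axis=1, keepdims=True)
    sim = a @ p.T / temperature
    # Subtract the row-wise max for a numerically stable log-softmax
    sim = sim - sim.max(axis=1, keepdims=True)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal
    return float(-np.mean(np.diag(log_prob)))
```

The loss is small when each sentence's two representations are closer to each other than to the representations of other sentences in the batch, and grows when the pairs are misaligned.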

aclanthology, computational linguistic, proceedings, (12 more...)

doi: 10.1016/j.knosys.2023.111302

arXiv:2303.02846

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
Asia > China > Guangdong Province > Shenzhen (0.05)
(16 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.88)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.88)

Pluvio: Assembly Clone Search for Out-of-domain Architectures and Libraries through Transfer Learning and Conditional Variational Information Bottleneck

Fu, Zhiwei, Ding, Steven H. H., Alaca, Furkan, Fung, Benjamin C. M., Charland, Philippe

arXiv.org Artificial Intelligence, Jul-20-2023

The practice of code reuse is crucial in software development for a faster and more efficient development lifecycle. In reality, however, code reuse practices lack proper control, resulting in issues such as vulnerability propagation and intellectual property infringements. Assembly clone search, a critical shift-right defence mechanism, has been effective in identifying vulnerable code resulting from reuse in released executables. Recent studies on assembly clone search demonstrate a trend towards using machine learning-based methods to match assembly code variants produced by different toolchains. However, these methods are limited to what they learn from a small number of toolchain variants used in training, rendering them inapplicable to unseen architectures and their corresponding compilation toolchain variants. This paper presents the first study on the problem of assembly clone search with unseen architectures and libraries. We propose incorporating human common knowledge through large-scale pre-trained natural language models, in the form of transfer learning, into current learning-based approaches for assembly clone search. Transfer learning can aid in addressing the limitations of the existing approaches, as it can bring in broader knowledge from human experts in assembly code. We further address the sequence limit issue by proposing a reinforcement learning agent to remove unnecessary and redundant tokens. Coupled with a new Variational Information Bottleneck learning strategy, the proposed system minimizes the reliance on potential indicators of architectures and optimization settings, for better generalization to unseen architectures. We simulate the unseen architecture clone search scenarios, and the experimental results show the effectiveness of the proposed approach against the state-of-the-art solutions.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv:2307.10631

Country:

North America > Canada > Ontario > Kingston (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
(12 more...)

Genre: Research Report > Promising Solution (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck

Eom, Youngsik, Lee, Yeonghyeon, Um, Ji Sub, Kim, Hoirin

arXiv.org Artificial Intelligence, Dec-14-2022

Recent advances in sophisticated synthetic speech generated from text-to-speech (TTS) or voice conversion (VC) systems pose threats to existing automatic speaker verification (ASV) systems. Since such synthetic speech is generated by diverse algorithms, the ability to generalize from limited training data is indispensable for a robust anti-spoofing system. In this work, we propose a transfer learning scheme based on the wav2vec 2.0 pretrained model with variational information bottleneck (VIB) for the speech anti-spoofing task. Evaluation on the ASVspoof 2019 logical access (LA) database shows that our method improves the performance of distinguishing unseen spoofed and genuine speech, outperforming current state-of-the-art anti-spoofing systems. Furthermore, we show that the proposed system significantly improves performance in low-resource and cross-dataset settings of the anti-spoofing task, demonstrating that our system is also robust in terms of data size and data distribution.

artificial intelligence, machine learning, speech, (17 more...)

doi: 10.21437/Interspeech.2022-10200

arXiv:2204.01387

Country: Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.62)

A Variational AutoEncoder for Transformers with Nonparametric Variational Information Bottleneck

Henderson, James, Fehr, Fabio

arXiv.org Artificial Intelligence, Aug-12-2022

Attention-based deep learning models, such as Transformers (Vaswani et al., 2017; Devlin et al., 2019), have achieved unprecedented empirical success in a wide range of cognitive tasks, in particular in natural language processing (NLP). On the other hand, deep variational Bayesian approaches to representation learning, such as variational autoencoders (VAEs) (Kingma and Welling, 2014), have also been very influential, especially due to their variational information bottleneck (VIB) (Alemi et al., 2017; Kingma and Welling, 2014) for regularising the induced latent representations. Previous VIB methods only apply to a vector space, and Transformers crucially do not use a single vector as their latent representation, instead using a set of vectors (Lin et al., 2020; Fang et al., 2021; Park and Lee, 2021). This allows the number of vectors in a Transformer embedding to grow with the size of the input, which is essential for embedding natural language text (Bahdanau et al., 2015), where the size of the input can range from a single word to thousands of words. In this paper, we propose a variational information bottleneck regulariser for set-of-vector latent representations, and use it to regularise the induced latent representation of a Transformer encoder-decoder variational autoencoder.

posterior, representation, vector, (14 more...)

arXiv:2207.13529

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Jordan (0.04)
(11 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)